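Starting with C# 3.0, LINQ provides a Distinct method, which gives collections a richer set of operations. Searching online, I found quite a few well-written articles on the topic, along with some very effective tricks for working around the places where LINQ is inconvenient. My own work has not yet required going very deep into this, so for now I am only writing up and sharing the simplest uses.
Using Distinct on a simple one-dimensional collection: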
// Requires: using System; using System.Collections.Generic; using System.Linq;
List<int> ages = new List<int> { 21, 46, 46, 55, 17, 21, 55, 55 };
List<string> names = new List<string> { "wang", "li", "zhang", "li", "wang", "chen", "he", "wang" };
IEnumerable<int> distinctAges = ages.Distinct();
Console.WriteLine("Distinct ages:");
foreach (int age in distinctAges)
{
    Console.WriteLine(age);
}
var distinctNames = names.Distinct();
Console.WriteLine("\nDistinct names:");
foreach (string name in distinctNames)
{
    Console.WriteLine(name);
}
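This is the simplest use of Distinct(). One result is declared through the collection interface IEnumerable<int>, the other with the implicitly typed var. The two forms are equivalent: var is resolved at compile time, so distinctNames is typed IEnumerable<string>. Code like the following, however, is wrong!
List<int> disAge = ages.Distinct(); // does not compile: IEnumerable<int> is not a List<int>
The correct way is: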
List<int> ages = new List<int> { 21, 46, 46, 55, 17, 21, 55, 55 };
List<int> disAge = ages.Distinct().ToList();
foreach (int a in disAge)
    Console.WriteLine(a);
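The same fix can be wrapped in a small helper. This is only a minimal sketch: the class name ToListSketch and the method DistinctAges are made up for illustration, and ToArray() is shown as an alternative way to get a concrete collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not part of the original post.
public static class ToListSketch
{
    // Returns the distinct ages as a concrete List<int>.
    public static List<int> DistinctAges(IEnumerable<int> ages)
    {
        // Distinct() is typed IEnumerable<int>; ToList() copies the result into a List<int>.
        return ages.Distinct().ToList();
    }

    public static void Main()
    {
        var ages = new List<int> { 21, 46, 46, 55, 17, 21, 55, 55 };

        List<int> asList = DistinctAges(ages);
        int[] asArray = ages.Distinct().ToArray(); // ToArray() materializes the sequence as an array

        Console.WriteLine(asList.Count);   // 4 distinct values
        Console.WriteLine(asArray.Length); // 4
    }
}
```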
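In other words, Distinct() returns an interface type (IEnumerable<T>), not a concrete collection, so a call to ToList() is needed.
Using Distinct with a custom class:
First, look at the example given on MSDN, which starts by defining a product class: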
public class Product : IEquatable<Product>
{
    public string Name { get; set; }
    public int Code { get; set; }

    public bool Equals(Product other)
    {
        //Check whether the compared object is null.
        if (Object.ReferenceEquals(other, null)) return false;
        //Check whether the compared object references the same data.
        if (Object.ReferenceEquals(this, other)) return true;
        //Check whether the products' properties are equal.
        return Code.Equals(other.Code) && Name.Equals(other.Name);
    }

    // If Equals() returns true for a pair of objects
    // then GetHashCode() must return the same value for these objects.
    public override int GetHashCode()
    {
        //Get hash code for the Name field if it is not null.
        int hashProductName = Name == null ? 0 : Name.GetHashCode();
        //Get hash code for the Code field.
        int hashProductCode = Code.GetHashCode();
        //Calculate the hash code for the product.
        return hashProductName ^ hashProductCode;
    }
}
In Main, it is used like this:
static void Main(string[] args)
{
    Product[] products =
    {
        new Product { Name = "apple", Code = 9 },
        new Product { Name = "orange", Code = 4 },
        new Product { Name = "apple", Code = 9 },
        new Product { Name = "lemon", Code = 12 }
    };
    //Exclude duplicates.
    IEnumerable<Product> noduplicates =
        products.Distinct();
    foreach (var product in noduplicates)
        Console.WriteLine(product.Name + " " + product.Code);
}
This produces the output:
/*
This code produces the following output:
apple 9
orange 4
lemon 12
*/
Now suppose we change Main to this:
static void Main(string[] args)
{
    Product[] products =
    {
        new Product { Name = "Smallapple", Code = 9 },
        new Product { Name = "orange", Code = 4 },
        new Product { Name = "Bigapple", Code = 9 },
        new Product { Name = "lemon", Code = 12 }
    };
    //Exclude duplicates.
    IEnumerable<Product> noduplicates =
        products.Distinct();
    foreach (var product in noduplicates)
        Console.WriteLine(product.Name + " " + product.Code);
}
This produces the output:
/*
This code produces the following output:
Smallapple 9
orange 4
Bigapple 9
lemon 12
*/
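Both products with Code 9 survive here because Product.Equals compares Name as well as Code. For reference, a comparer dedicated to Code could look like the sketch below, passed to the Distinct overload that accepts an IEqualityComparer<T>. ProductCodeComparer and ComparerSketch are hypothetical names, not part of the MSDN sample, and Product is trimmed to its two properties so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copy of the Product class, repeated so the sketch stands alone.
public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}

// Hypothetical comparer: two products count as equal when their Code matches.
public class ProductCodeComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.Code == y.Code;
    }

    // Must agree with Equals: products with equal Code get equal hash codes.
    public int GetHashCode(Product obj)
    {
        return ReferenceEquals(obj, null) ? 0 : obj.Code.GetHashCode();
    }
}

public static class ComparerSketch
{
    public static void Main()
    {
        Product[] products =
        {
            new Product { Name = "Smallapple", Code = 9 },
            new Product { Name = "orange", Code = 4 },
            new Product { Name = "Bigapple", Code = 9 },
            new Product { Name = "lemon", Code = 12 }
        };

        var byCode = products.Distinct(new ProductCodeComparer()).ToList();
        Console.WriteLine(byCode.Count); // 3: one product per Code (9, 4, 12)
    }
}
```

A class like this has to be written once per field, which is where the approach becomes repetitive.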
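The problem now is that we want to key on Code, that is, keep only one member per Code value. That requires defining a separate class that compares by Code, or extending it into a generic class, and either way is rather tedious.
An improvement from the blog 鹤冲天 (everything below is reposted from that blog)
First, create a general-purpose comparer class that implements the IEqualityComparer<T> interface: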
public class CommonEqualityComparer<T, V> : IEqualityComparer<T>
{
    // Extracts the key (of type V) that decides whether two T values are equal.
    private Func<T, V> keySelector;

    public CommonEqualityComparer(Func<T, V> keySelector)
    {
        this.keySelector = keySelector;
    }

    public bool Equals(T x, T y)
    {
        return EqualityComparer<V>.Default.Equals(keySelector(x), keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        return EqualityComparer<V>.Default.GetHashCode(keySelector(obj));
    }
}
With this class in place, a Distinct extension method can be written like this:
public static class DistinctExtensions
{
    public static IEnumerable<T> Distinct<T, V>(this IEnumerable<T> source, Func<T, V> keySelector)
    {
        return source.Distinct(new CommonEqualityComparer<T, V>(keySelector));
    }
}
Using it is then very simple:
Product[] products =
{
    new Product { Name = "Smallapple", Code = 9 },
    new Product { Name = "orange", Code = 4 },
    new Product { Name = "Bigapple", Code = 9 },
    new Product { Name = "lemon", Code = 12 }
};
var p1 = products.Distinct(p => p.Code);
foreach (Product pro in p1)
    Console.WriteLine(pro.Name + "," + pro.Code);
var p2 = products.Distinct(p => p.Name);
foreach (Product pro in p2)
    Console.WriteLine(pro.Name + "," + pro.Code);
As you can see, with a lambda expression as the key selector, Distinct can conveniently be applied to any field of a custom class.
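For comparison, the same "one item per key" result can also be had without any custom comparer by grouping on the key and taking the first element of each group. The following is only a sketch of that alternative, not the blog's method; GroupBySketch and DistinctByCode are hypothetical names, and Product is again trimmed to its two properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copy of the Product class, repeated so the sketch stands alone.
public class Product
{
    public string Name { get; set; }
    public int Code { get; set; }
}

public static class GroupBySketch
{
    // Keeps the first product seen for each Code value.
    public static List<Product> DistinctByCode(IEnumerable<Product> source)
    {
        return source.GroupBy(p => p.Code)   // one group per Code, in order of first appearance
                     .Select(g => g.First()) // first element of each group, in source order
                     .ToList();
    }

    public static void Main()
    {
        Product[] products =
        {
            new Product { Name = "Smallapple", Code = 9 },
            new Product { Name = "orange", Code = 4 },
            new Product { Name = "Bigapple", Code = 9 },
            new Product { Name = "lemon", Code = 12 }
        };

        // Prints: Smallapple,9  orange,4  lemon,12 (one per line)
        foreach (Product pro in DistinctByCode(products))
            Console.WriteLine(pro.Name + "," + pro.Code);

        // On .NET 6 or later, the built-in DistinctBy covers the same case:
        // var viaDistinctBy = products.DistinctBy(p => p.Code);
    }
}
```

The grouping version builds a group for every key, so for large inputs the comparer-based extension method above may be the lighter choice.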